Ophthalmology Science
Elsevier BV
All preprints, ranked by how well they match Ophthalmology Science's content profile, based on 20 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Mutisya, F.; Onyango, O.; Sitati, S.; Ilovi, S.; W'mosi, B.; Macharia, P.; Makini, B.; Aluuvala, J.; Onyango, J.; Wanyee, S.
Background: Retinopathy of prematurity (ROP) is a leading cause of preventable blindness among preterm infants. Accurate retinal vessel segmentation is crucial for detecting plus disease, which indicates progression to severe ROP. However, manual annotation of vessel masks is laborious and inconsistent, especially in low-resource clinical settings. This study aimed to evaluate a self-supervised vessel extraction pipeline using Frangi-Hessian filtering for automatic pseudo-annotation of unlabeled RetCam and Neo retinal images and to compare its performance against supervised and hybrid deep learning frameworks. Methods: Two public datasets from the HVDROPDB-BV repository, RetCam_Vessels and Neo_Vessels, were used. We implemented a three-stage pipeline: automatic self-annotation of unlabeled images through vessel-based mask generation; training of five segmentation architectures (BioSwinFuseNet, UNet, FPN, LinkNet, and SegFormer) under three regimes (GT-only, Self-only, and Hybrid GT+Self); and evaluation using Dice, IoU, sensitivity, specificity, PPV, NPV, F1, and AUC metrics. All models were trained with a topology-aware loss that combined binary cross-entropy and Dice losses with continuity penalties. Results: Hybrid supervision consistently outperformed both GT-only and Self-only training across all architectures. The SegFormer-Hybrid model achieved the highest Dice (0.61) and IoU (0.44), while FPN-Hybrid demonstrated the lowest variance. BioSwinFuseNet-Hybrid showed a 122% relative improvement in Dice compared to its GT-only version. Self-only models learned rudimentary vessel priors but lacked clinical precision. Conclusions: Incorporating self-annotated masks alongside limited ground truth improves segmentation accuracy and vessel continuity. The hybrid paradigm offers a scalable path for developing automated ROP screening tools where expert labeling is limited.
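The overlap metrics used throughout these evaluations (Dice, IoU, sensitivity, specificity) have standard definitions for binary masks. A minimal NumPy sketch, not taken from any of the papers' code:

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Standard overlap metrics for two binary masks (hypothetical helper,
    not from any paper's pipeline)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()      # predicted vessel, truly vessel
    fp = np.logical_and(pred, ~gt).sum()     # predicted vessel, background
    fn = np.logical_and(~pred, gt).sum()     # missed vessel
    tn = np.logical_and(~pred, ~gt).sum()    # correctly background
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

Note that Dice and IoU are monotonically related (Dice = 2·IoU / (1 + IoU)), which is why papers often report both from the same confusion counts.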
Ofosu Mensah, S.; Neubauer, J.; Ayhan, M. S.; Djoumessi Donteu, K. R.; Koch, L. M.; Uzel, M. M.; Gelisken, F.; Berens, P.
Epiretinal membrane (ERM) is a vitreoretinal interface disease that, if not properly addressed, can lead to vision impairment and negatively affect quality of life. For ERM detection and treatment planning, optical coherence tomography (OCT) has become the primary imaging modality, offering non-invasive, high-resolution cross-sectional imaging of the retina. Deep learning models have achieved good ERM detection performance on OCT images. Nevertheless, most deep learning models cannot be easily understood by clinicians, which limits their acceptance in clinical practice. Post-hoc explanation methods have been utilised to support the uptake of models, albeit with partial success. In this study, we trained a sparse BagNet model, an inherently interpretable deep learning model, to detect ERM in OCT images. It performed on par with a comparable black-box model and generalised well to external data. In a multitask setting, it also accurately predicted other changes related to ERM pathophysiology. Through a user study with ophthalmologists, we showed that the visual explanations readily provided by the sparse BagNet model for its decisions are well aligned with clinical expertise. We propose potential directions for clinical implementation of the sparse BagNet model to guide clinical decisions in practice.
Wang, D. T.; Antonio-Aguirre, B.; Pan, A.; Ruggeri, M. L.; Mehta, S. P.; Smith, C. H.; Guthrie, K. S.; Applegate, C.; Doyle, J. J.; Singh, M. S.
While diagnostic disparities by race and age of onset are reported in inherited retinal diseases, their impact on Stargardt disease (STGD), a clinically and genetically heterogeneous macular dystrophy, remains unclear. We analyzed 246 STGD patients at a U.S. referral center (2003-2024) who completed genetic testing, comparing laboratory-reported results (lab-GT) with manual, phenotype-integrated reinterpretation (m-GT) of ABCA4 sequencing incorporating updated variant databases and genotype-phenotype correlation. Diagnostic yield and variant burden were assessed by race and age of onset. Positive/likely positive (P/LP) lab-GT was identified in 79% (195), with 78% (191) attributable to ABCA4 (ABCA4-positive lab-GT: 57% [141]). M-GT increased ABCA4-P/LP yield to 91% (224). Black participants had lower ABCA4-positive lab-GT than Whites (55% vs. 73%) and fewer pathogenic variants; on multivariable analysis, Black race (OR 0.34) and later age of onset (OR 0.95/year) independently predicted reduced molecular diagnosis. The disparity by race resolved with P/LP m-GT (89% vs. 90%); by age of onset, yield remained lower in late-onset cases (ABCA4-P/LP lab-GT: 86% early-onset [≤10 yrs], 83% intermediate-onset [11-44 yrs], 54% late-onset [≥45 yrs], improving to 97%, 94%, and 77% after m-GT). Post-test reinterpretation improves diagnostic yield, particularly for Black and late-onset STGD patients, underscoring the value of ancestry-informed interpretation, historical reanalysis, and genotype-phenotype correlation.
Valmaggia, P.; Cattin, P. C.; Sandkuehler, R.; Inglin, N.; Otto, T. P.; Aumann, S.; Teussink, M. M.; Spaide, R. F.; Scholl, H. P. N.; Maloca, P. M.
Purpose: Optical coherence tomography (OCT) representations in clinical practice are static and do not allow for dynamic visualisation and quantification of blood flow. This study aims to present a method to analyse retinal blood flow dynamics using time-resolved structural OCT. Methods: We developed novel imaging protocols to acquire video-rate time-resolved OCT B-scans (1024 x 496 pixels, 10° field of view) at four different sensor integration times (44.8 µs at a nominal A-scan rate of 20 kHz, 22.4 µs at 40 kHz, 11.2 µs at 85 kHz, 7.24 µs at 125 kHz). The vessel centres were manually annotated for each B-scan and surrounding subvolumes were extracted. We used a velocity model based on signal-to-noise ratio (SNR) drops due to fringe washout to calculate blood flow velocity profiles in vessels within five optic disc diameters of the optic disc rim. Results: Time-resolved dynamic structural OCT revealed pulsatile SNR changes in the analysed vessels and allowed the calculation of potential blood flow velocities at all integration times. Fringe washout was stronger in acquisitions with longer integration times; however, the ratio of the average SNR to the peak SNR inside the vessel was similar across all integration times. Conclusions: We demonstrated the feasibility of estimating blood flow profiles based on fringe washout analysis, showing pulsatile dynamics in vessels close to the optic nerve head using structural OCT. Time-resolved dynamic OCT has the potential to uncover valuable blood flow information in clinical settings. Commercial relationships: PV received funding from the Swiss National Science Foundation (Grant 323530_199395), the Janggen-Pohn Stiftung and AlumniMedizin Basel and discloses personal compensation from Heidelberg Engineering GmbH. TPO, SA and MMT are salaried employees of Heidelberg Engineering GmbH, 69115 Heidelberg, Germany.
RFS discloses personal compensation from Topcon Medical Systems, Roche, Bayer, Heidelberg Engineering and Genentech. HPNS is supported by the Swiss National Science Foundation (Project funding: "Developing novel outcomes for clinical trials in Stargardt disease using structure/function relationship and deep learning" #310030_201165, and National Center of Competence in Research Molecular Systems Engineering: "NCCR MSE: Molecular Systems Engineering (phase II)" #51NF40-182895), the Wellcome Trust (PINNACLE study), and the Foundation Fighting Blindness Clinical Research Institute (ProgStar study). HPNS is a member of the Scientific Advisory Board of Boehringer Ingelheim Pharma GmbH & Co; Claris Biotherapeutics Inc.; Eluminex Biosciences; Gyroscope Therapeutics Ltd.; Janssen Research & Development, LLC (Johnson & Johnson); Novartis Pharma AG (CORE); Okuvision GmbH; ReVision Therapeutics Inc.; and Saliogen Therapeutics Inc. HPNS is a consultant of: Alnylam Pharmaceuticals Inc.; Gerson Lehrman Group Inc.; Guidepoint Global, LLC; and Intergalactic Therapeutics Inc. HPNS is member of the Data Monitoring and Safety Board/Committee of Belite Bio (CT2019-CTN-04690-1), F. Hoffmann-La Roche Ltd (VELODROME trial, NCT04657289; DIAGRID trial, NCT05126966; HUTONG trial) and member of the Steering Committee of Novo Nordisk (FOCUS trial; NCT03811561). All arrangements have been reviewed and approved by the University of Basel (Universitatsspital Basel, USB) and the Board of Directors of the Institute of Molecular and Clinical Ophthalmology Basel (IOB), in accordance with their conflict-of-interest policies. Compensation is being negotiated and administered as grants by USB, which receives them on its proper accounts. HPNS is co-director of the Institute of Molecular and Clinical Ophthalmology Basel (IOB), which is constituted as a non-profit foundation and receives funding from the University of Basel, the University Hospital Basel, Novartis and the government of Basel-Stadt. 
PMM is a consultant for Roche and holds intellectual property for machine learning at MIMO AG and VisionAI, Switzerland. Funding organisations had no influence on the design, performance or evaluation of the current study. The other authors declare no conflicts of interest.
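The average-to-peak SNR ratio reported above can be illustrated on a synthetic pulsatile trace. The helper below is hypothetical (the study's actual processing is not shown here) and assumes SNR values are given in dB:

```python
import numpy as np

def avg_to_peak_snr_ratio(snr_db):
    """Average-to-peak SNR ratio over a time-resolved B-scan series,
    computed on the linear scale from dB inputs (illustrative helper)."""
    snr_lin = 10 ** (np.asarray(snr_db, dtype=float) / 10.0)
    return snr_lin.mean() / snr_lin.max()

# Synthetic pulsatile trace: stronger fringe washout (lower SNR) at systole.
t = np.linspace(0, 2, 200)                                  # ~two cardiac cycles
snr = 20 - 6 * (0.5 + 0.5 * np.sin(2 * np.pi * t)) ** 2     # dB, dips to ~14 dB
ratio = avg_to_peak_snr_ratio(snr)
```

A constant trace gives a ratio of 1.0; deeper systolic washout pushes the ratio lower, which is the intuition behind comparing it across integration times.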
Ayhan, M. S.; Faber, H.; Kuehlewein, L.; Inhoffen, W.; Aliyeva, G.; Ziemssen, F.; Berens, P.
Purpose: Comparison of the performance and explainability of a multi-task convolutional deep neural network to single-task networks for activity detection in neovascular age-related macular degeneration (nAMD). Methods: From n = 70 patients (46 female, 24 male) who attended the University Eye Hospital Tübingen, 3762 optical coherence tomography B-scans (right eye: 2011, left eye: 1751) were acquired with a Heidelberg Spectralis device (Heidelberg, Germany). B-scans were graded by a retina specialist and an ophthalmology resident, and then used to develop a multi-task deep learning model to predict disease activity in nAMD along with the presence of sub- and intraretinal fluid. We used performance metrics for comparison to single-task networks and visualized the DNN-based decisions with t-distributed stochastic neighbor embedding (t-SNE) and clinically validated saliency mapping techniques. Results: The multi-task model surpassed single-task networks in accuracy for activity detection (94.2%). Furthermore, compared to single-task networks, visualizations via t-SNE and saliency maps highlighted that multi-task network decisions for activity detection in nAMD were highly consistent with the presence of both sub- and intraretinal fluid. Conclusions: Multi-task learning increases the performance of neural networks for predicting disease activity, while providing clinicians with easily accessible decision control that resembles human reasoning. Translational Relevance: By improving nAMD activity detection performance and the transparency of automated decisions, multi-task DNNs can support the translation of machine learning research into clinical decision support systems for nAMD activity detection.
Bhandari, S. M.; Singh, P.; Arun, N.; Sekimitsu, S.; Raghu, V.; Rauscher, F. G.; Elze, T.; Horn, K.; Kirsten, T.; Scholz, M.; Segre, A. V.; Wiggs, J. L.; Kalpathy-Cramer, J.; Zebardast, N.
Heritability of common eye diseases and ocular traits is relatively high. Here, we develop an automated algorithm to detect genetic relatedness from color fundus photographs (FPs). We estimated the degree of shared ancestry amongst individuals in the UK Biobank using the KING software. A convolutional Siamese neural network-based algorithm was trained to output a measure of genetic relatedness using 7224 pairs (3612 related and 3612 unrelated) of FPs. The model achieved high performance for prediction of genetic relatedness; when computed Euclidean distances were used to determine the probability of relatedness, the area under the receiver operating characteristic curve (AUROC) for identifying related FPs reached 0.926. We performed external validation of our model using FPs from the LIFE-Adult study and achieved an AUROC of 0.69. An occlusion map indicates that the optic nerve and its surrounding area may be the most predictive of genetic relatedness. We demonstrate that genetic relatedness can be captured from FP features. This approach may be used to uncover novel biomarkers for common ocular diseases.
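When a scalar score such as a Siamese embedding distance is used to separate related from unrelated pairs, the AUROC can be computed directly from the two distance distributions via the Mann-Whitney identity. A small NumPy sketch (illustrative, not the study's code), where a smaller distance means "more likely related":

```python
import numpy as np

def auroc_from_distances(d_related, d_unrelated):
    """AUROC for detecting 'related' pairs from embedding distances.
    Uses AUROC = P(distance_related < distance_unrelated), counting ties
    as half (Mann-Whitney identity). Illustrative helper only."""
    d_rel = np.asarray(d_related, dtype=float)
    d_unrel = np.asarray(d_unrelated, dtype=float)
    wins = (d_rel[:, None] < d_unrel[None, :]).sum()   # related pair scored closer
    ties = (d_rel[:, None] == d_unrel[None, :]).sum()
    return (wins + 0.5 * ties) / (d_rel.size * d_unrel.size)
```

Perfect separation gives 1.0 and chance-level separation gives 0.5, matching how the 0.926 internal and 0.69 external figures are usually interpreted.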
Yucel, H.; Janjua, K.; Kesim, C.; Hasanreisoglu, M.; Halim, M. S.; Shaik, M. A. S.; Ahmed, M. I.; Olsen-Glittenberg, C.-G.; Esmaeelpou, M.; Chung, V.; Nguyen, Q.; Sepah, Y. J.
Purpose: To introduce and validate Flowdef, an open-source MATLAB application that automatically excludes drusen- and vessel-related projection artifacts and enables region-specific quantification of choriocapillaris (CC) flow voids in eyes with age-related macular degeneration (AMD). Methods: Thirty eyes (10 healthy controls, 10 intermediate AMD, 10 dry AMD with geographic atrophy [GA]) underwent 6 x 6 mm OCT-angiography. Flowdef generates exclusion masks for drusen (Otsu threshold x 1.2) and superficial vessels (CLAHE + morphology) and identifies CC flow voids using a "mean - 1 SD" threshold. For GA eyes, an interactive affine registration aligns infrared and OCT en face images to delineate healthy, transition, and atrophic zones. Repeatability (same-day test-retest) and inter-rater reproducibility were assessed with intraclass correlation coefficients (ICC) and coefficients of variation (CV). Results: Flowdef processed all 30 eyes without failure. After artifact exclusion, atrophic zones showed the largest mean flow-void area (1078 µm²) while healthy zones exhibited more numerous but smaller voids (mean area 871 µm²). Test-retest repeatability was excellent in GA (ICC 0.92-0.99) and control eyes (ICC 0.95-0.99) and variable in intermediate AMD (ICC 0.08-0.80). Inter-rater CV was <10% for most parameters except mean area in intermediate AMD (CV ≈200%). Conclusions: Flowdef provides a robust, freely available solution for artifact-reduced CC flow-void analysis and region-specific GA assessment, addressing key limitations of existing methods. Translational Relevance: By delivering reliable CC metrics with minimal user input, Flowdef can support longitudinal AMD monitoring and accelerate clinical research on emerging therapies.
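The "mean - 1 SD" flow-void rule with artifact exclusion reduces to a few lines once the exclusion masks exist. A NumPy sketch of only the thresholding step (drusen/vessel mask generation is assumed already done; this is not Flowdef's MATLAB code):

```python
import numpy as np

def flow_void_mask(cc_enface, exclude_mask):
    """Flag choriocapillaris flow voids as pixels darker than mean - 1 SD,
    with statistics computed over non-excluded pixels only (sketch of the
    thresholding rule described above)."""
    valid = ~exclude_mask                      # pixels free of drusen/vessel artifact
    mu = cc_enface[valid].mean()
    sd = cc_enface[valid].std()
    return (cc_enface < mu - sd) & valid       # dark AND not excluded
```

Computing the mean and SD only over valid pixels matters: including artifact regions would bias the threshold and inflate or suppress the detected void area.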
Zhang, W.; Chotcomwongse, P.; Chen, X.; Chung, F. H. T.; Song, F.; Zhang, X.; He, M.; Shi, D.; Ruamviboonsuk, P.
Fundus angiography techniques, including fundus fluorescein angiography (FFA) and indocyanine green angiography (ICGA), are essential examination tools for visualizing lesions and changes in retinal and choroidal vasculature. However, the interpretation of angiography images is labor-intensive and time-consuming. In response to this, we are organizing the third APTOS competition for automated and interpretable angiographic report generation. For this purpose, we have released the first angiographic dataset, which includes over 50,000 images labeled by retinal specialists. This dataset covers 24 conditions and provides detailed descriptions of the type, location, shape, size and pattern of abnormal fluorescence to enhance interpretability and accessibility. Additionally, we have implemented two baseline methods that achieve overall test-set scores of 7.966 and 7.947 using the classification method and the language generation method, respectively. We anticipate that this initiative will expedite the application of artificial intelligence in automatic report generation, thereby reducing the workload of clinicians and benefiting patients on a broader scale.
Siraz, S.; Kamanda, H.; Gholami, S.; Nabil, A. S.; Ong, S. S. Y.; Alam, M. N.
Purpose: To develop and validate deep learning (DL)-based models for classifying geographic atrophy (GA) subtypes using optical coherence tomography (OCT) scans across four clinical classification tasks. Design: Retrospective comparative study evaluating three DL architectures on OCT data with two experimental approaches. Subjects: 455 OCT volumes (258 central GA [CGA], 74 non-central GA [NCGA], 123 no GA [NGA]) from 104 patients at Atrium Health Wake Forest Baptist. For GA versus age-related macular degeneration (AMD) classification, we supplemented our dataset with AMD cases from four public repositories. Methods: We implemented ResNet50, MobileNetV2, and Vision Transformer (ViT-B/16) architectures using two approaches: (1) utilizing all B-scans within each OCT volume and (2) selectively using B-scans containing foveal regions. Models were trained using transfer learning, standardized data augmentation, and patient-level data splitting (70:15:15 ratio) for training, validation, and testing. Main Outcome Measures: Area under the receiver operating characteristic curve (AUC-ROC), F1 score, and accuracy for each classification task (CGA vs. NCGA, CGA vs. NCGA vs. NGA, GA vs. NGA, and GA vs. other forms of AMD). Results: ViT-B/16 consistently outperformed the other architectures across all classification tasks. For CGA versus NCGA classification, ViT-B/16 achieved an AUC-ROC of 0.728±0.083 and accuracy of 0.831±0.006 using selective B-scans. In GA versus NGA classification, ViT-B/16 attained an AUC-ROC of 0.950±0.002 and accuracy of 0.873±0.012 with selective B-scans. All models demonstrated exceptional performance in distinguishing GA from other AMD forms (AUC-ROC > 0.998). For multi-class classification, ViT-B/16 achieved an AUC-ROC of 0.873±0.003 and accuracy of 0.751±0.002 using selective B-scans. Conclusions: Our DL approach successfully classifies GA subtypes with clinically relevant accuracy. ViT-B/16 demonstrates superior performance due to its ability to capture spatial relationships between atrophic regions and the foveal center. Focusing on B-scans containing foveal regions improved diagnostic accuracy while reducing computational requirements, better aligning with clinical practice workflows.
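Patient-level splitting, as used above, prevents B-scans from the same patient leaking across train/validation/test partitions. A generic sketch of a 70:15:15 patient-level split (hypothetical helper and data layout, not the authors' code):

```python
import random

def patient_level_split(volume_to_patient, ratios=(0.70, 0.15, 0.15), seed=0):
    """Assign each OCT volume to train/val/test so that no patient spans
    more than one partition. `volume_to_patient` maps volume ID -> patient ID."""
    patients = sorted(set(volume_to_patient.values()))
    rng = random.Random(seed)          # deterministic shuffle for reproducibility
    rng.shuffle(patients)
    n = len(patients)
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    part = {p: "train" for p in patients[:n_train]}
    part.update({p: "val" for p in patients[n_train:n_train + n_val]})
    part.update({p: "test" for p in patients[n_train + n_val:]})
    # Every volume inherits its patient's partition.
    return {vol: part[pat] for vol, pat in volume_to_patient.items()}
```

Splitting at the volume or B-scan level instead would let near-identical scans of one eye appear in both train and test, inflating reported accuracy.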
Woof, W. A.; de Guimaraes, T. A. C.; Al-Khuzaei, S.; Daich Varela, M.; Shah, M.; Naik, G.; Sen, S.; Bagga, P.; Chan, Y. W.; Mendes, B. S.; Lin, S.; Ghoshal, B.; Liefers, B.; Fu, D. J.; Georgiou, M.; da Silva, A. S.; Nguyen, Q.; Liu, Y.; Fujinami-Yokokawa, Y.; Sumodhee, D.; Furman, J.; Patel, P. J.; Moghul, I.; Moosajee, M.; Sallum, J.; De Silva, S. R.; Lorenz, B.; Herrmann, P.; Holz, F. G.; Fujinami, K.; Webster, A. R.; Mahroo, O. A.; Downes, S. M.; Madhusudhan, S.; Balaskas, K.; Michaelides, M.; Pontikos, N.
Purpose: To quantify spectral-domain optical coherence tomography (SD-OCT) images cross-sectionally and longitudinally in a large cohort of molecularly characterized patients with inherited retinal disease (IRD) from the UK. Design: Retrospective study of imaging data. Participants: Patients with a clinical and molecularly confirmed diagnosis of IRD who underwent macular SD-OCT imaging at Moorfields Eye Hospital (MEH) between 2011 and 2019. We retrospectively identified 4,240 IRD patients from the MEH database (198 distinct IRD genes), including 69,664 SD-OCT macular volumes. Methods: Eight features of interest were defined: retina, fovea, intraretinal cystic spaces (ICS), subretinal fluid (SRF), subretinal hyper-reflective material (SHRM), pigment epithelium detachment (PED), ellipsoid zone loss (EZ-loss) and retinal pigment epithelium loss (RPE-loss). Manual annotation of five B-scans per SD-OCT volume was performed for the retinal features by four graders based on a defined grading protocol. A total of 1,749 B-scans from 360 SD-OCT volumes across 275 patients were annotated for the eight retinal features for training and testing of a neural-network-based segmentation model, AIRDetect-OCT, which was then applied to the entire imaging dataset. Main Outcome Measures: Performance of AIRDetect-OCT, compared with inter-grader agreement, was evaluated using the Dice score on a held-out dataset. Feature prevalence, volume and area were analysed cross-sectionally and longitudinally. Results: The inter-grader Dice score for manual segmentation was ≥90% for retina, ICS, SRF, SHRM and PED, and >77% for both EZ-loss and RPE-loss. Model-grader agreement was >80% for segmentation of retina, ICS, SRF, SHRM, and PED, and >68% for both EZ-loss and RPE-loss. Automatic segmentation was applied to 272,168 B-scans across 7,405 SD-OCT volumes from 3,534 patients encompassing 176 unique genes. Accounting for age, male patients exhibited significantly more EZ-loss (19.6 mm² vs 17.9 mm², p < 2.8×10⁻⁴) and RPE-loss (7.79 mm² vs 6.15 mm², p < 3.2×10⁻⁶) than female patients. RPE-loss was significantly higher in Asian patients than in other ethnicities (9.37 mm² vs 7.29 mm², p < 0.03). ICS average total volume was largest in RS1 (0.47 mm³) and NR2E3 (0.25 mm³), SRF in BEST1 (0.21 mm³) and PED in EFEMP1 (0.34 mm³). BEST1 and PROM1 showed significantly different patterns of EZ-loss (p < 10⁻⁴) and RPE-loss (p < 0.02) comparing the dominant to the recessive forms. Sectoral analysis revealed significantly increased EZ-loss in the inferior quadrant compared to the superior quadrant for RHO (Δ = -0.414 mm², p = 0.036) and EYS (Δ = -0.908 mm², p = 1.5×10⁻⁴). In ABCA4 retinopathy, more severe genotypes (group A) were associated with faster progression of EZ-loss (2.80±0.62 mm²/yr), whilst the p.(Gly1961Glu) variant (group D) was associated with slower progression (0.56±0.18 mm²/yr). There were also sex differences within groups, with males in group A experiencing significantly faster rates of progression of RPE-loss (2.48±1.40 mm²/yr vs 0.87±0.62 mm²/yr, p = 0.047), but lower rates in groups B, C, and D. Conclusions: AIRDetect-OCT, a novel deep learning algorithm, enables large-scale OCT feature quantification in IRD patients, uncovering cross-sectional and longitudinal phenotype correlations with demographic and genotypic parameters.
Kandakji, L.; Liu, S.; Balal, S.; Moghul, I.; Allan, B.; Tuft, S.; Gore, D.; Pontikos, N.
Purpose: To develop a deep learning model, Cornea nnU-Net Extractor (CUNEX), for full-thickness corneal segmentation of anterior segment optical coherence tomography (AS-OCT) images and evaluate its utility in artificial intelligence (AI) research. Methods: We trained and evaluated CUNEX using nnU-Net on 600 AS-OCT images (CSO MS-39) from 300 patients: 100 normal, 100 keratoconus (KC), and 100 Fuchs endothelial corneal dystrophy (FECD) eyes. To assess generalizability, we externally validated CUNEX on 1,168 AS-OCT images from an infectious keratitis dataset acquired with a different device (Casia SS-1000). We benchmarked CUNEX against two recent models, CorneaNet and ScLNet. We then applied CUNEX to our dataset of 194,599 scans from 37,499 patients as preprocessing for a classification model, evaluating whether segmentation improves AI prediction of age, sex, and disease staging (KC and FECD). Results: CUNEX achieved Dice similarity coefficient (DSC) and intersection over union (IoU) scores of 94-95% and 90-99%, respectively, across healthy, KC, and FECD eyes. This was similar to ScLNet (within 3%) but better than CorneaNet (8-35% lower). On external validation, CUNEX maintained high performance (DSC 83%; IoU 71%) while ScLNet (DSC 14%; IoU 8%) and CorneaNet (DSC 16%; IoU 9%) failed to generalize. Unexpectedly, segmentation minimally impacted classification accuracy except for sex prediction, where accuracy dropped from 81% to 68%, suggesting sex-related features may lie outside the cornea. Conclusion: CUNEX delivers the first open-source generalizable corneal segmentation model built on the latest nnU-Net framework, supporting its use in clinical analysis and AI workflows across diseases and imaging platforms. It is available at https://github.com/lkandakji/CUNEX.
Shrivastava, S.; Thakuria, U.; Kinder, S.; Nebbia, G.; Zebardast, N.; Baxter, S. L.; Xu, B. Y.; Aldeen Alryalat, S. A.; Kahook, M.; Kalpathy-Cramer, J.; Singh, P.
Importance: Glaucoma, a leading cause of blindness worldwide, depends on accurate optic nerve head assessment, particularly optic disc and cup segmentation, for diagnosis and monitoring. Deep learning (DL) models can automate these measurements, but models trained on smaller, site-specific datasets often fail to generalize. While larger, multi-site datasets help, data privacy concerns limit centralized training. Objective: To evaluate a federated learning (FL) framework with site-specific fine-tuning for optic disc and cup segmentation, aiming to match central model performance while preserving privacy and improving generalizability. Design: Comparative evaluation of FL with site-specific fine-tuning against three alternatives: (1) a central model trained on multi-site data, (2) site-specific local model training, and (3) standard FL models. Setting: Multicenter study incorporating nine publicly available datasets, representing varied clinical environments, populations, and imaging protocols. Participants: 5,550 color fundus photographs from at least 917 individuals across nine datasets, including both routine-care and research sources from 7 countries. Exposures: Optic disc and cup segmentation in color fundus photographs using local model training, central model training, standard FL, and FL with site-specific fine-tuning. Main Outcomes and Measures: Segmentation accuracy measured by Dice score. Comparisons were labeled as performance "wins" or "losses" based on statistically significant differences via the Wilcoxon signed-rank test (P < 0.05). Results: FL with site-specific fine-tuning matched central model performance for cup segmentation across all sites (9/9) and for disc segmentation in most sites (7/9). Compared with site-specific local models, it preserved within-site performance (cup: 9/9; disc: 5/9) while substantially improving cross-site generalizability, achieving significant gains in 54.2% (39/72) of disc and 25.0% (18/72) of cup external-site evaluations, with no significant losses. Compared to standard FL pipelines, site-specific fine-tuning improved performance by 52% for disc and 26% for cup. Conclusions and Relevance: Site-specific fine-tuning within an FL framework effectively personalizes generalized models to local data distributions, achieving central-level performance without data sharing and enhancing cross-site robustness. This approach enables privacy-preserving, scalable AI deployment across heterogeneous clinical settings for reproducible and generalizable glaucoma assessment. Key Points: Question: How can we train an AI model to segment the optic cup and disc across multiple sites without sharing data, yet achieve performance comparable to a central model trained on pooled datasets? Findings: In this federated learning (FL) study of 5,550 fundus photographs from nine sites, a site-specific fine-tuning FL strategy matched the central model's performance and outperformed other standard FL techniques, with notable gains in cross-site generalizability. Meaning: Site-specific fine-tuning effectively personalizes FL models to local data distributions, combining data privacy with robust, generalizable performance.
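The two FL stages compared above can be caricatured in a few lines: standard FedAvg aggregates site models by sample-size-weighted averaging of their weights, and site-specific fine-tuning then takes a few local gradient steps from the aggregated weights. A toy NumPy sketch with flat weight vectors and a quadratic local loss (not the study's implementation):

```python
import numpy as np

def fedavg(site_weights, site_sizes):
    """Sample-size-weighted average of per-site weight vectors
    (standard FedAvg aggregation step)."""
    total = float(sum(site_sizes))
    return sum(w * (n / total) for w, n in zip(site_weights, site_sizes))

def finetune_locally(global_w, local_optimum, lr=0.1, steps=10):
    """Personalize the aggregated model with a few local gradient steps.
    Toy local loss 0.5 * ||w - local_optimum||^2 has gradient (w - local_optimum)."""
    w = np.copy(global_w)
    for _ in range(steps):
        w -= lr * (w - local_optimum)
    return w
```

The fine-tuned weights move from the shared FedAvg solution toward the site's own optimum, which is the mechanism behind matching within-site performance while starting from a model that already generalizes across sites.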
Ghahramani, G. C.; Brendel, M.; Lin, M.; Chen, Q.; Keenan, T.; Chen, K.; Chew, E.; Lu, Z.; PENG, Y.; Wang, F.
Age-related macular degeneration (AMD) is the leading cause of vision loss. Some patients experience vision loss over a delayed timeframe, others at a rapid pace. Physicians analyze time-of-visit fundus photographs to predict patient risk of developing late AMD, the most severe form of the disease. Our study hypothesizes that (1) incorporating historical data improves the predictive strength for developing late AMD and (2) state-of-the-art deep-learning techniques extract more predictive image features than clinicians do. We incorporate longitudinal data from the Age-Related Eye Disease Studies and deep-learning-extracted image features in survival settings to predict development of late AMD. To extract image features, we used multi-task learning frameworks to train convolutional neural networks. Our findings show that (1) incorporating longitudinal data improves prediction of late AMD for clinical standard features, but only the current visit is informative when using complex features, and (2) "deep features" are more informative than clinician-derived features. We make our code publicly available at https://github.com/bionlplab/AMD_prognosis_amia2021.
Hallam, T. M.; Gardenal, E.; McBlane, F.; Cho, G.; Ferraro, L. L.; Pekle, E.; Lu, D.; Carney, K.; Wenden, C.; Beadsmore, H.; Kaiser, S.; Drage, L.; Haye, T.; Kassem, I.; Rangaswamy, N.; Obeidat, M.; Grosskreutz, C.; Saint-Geniez, M.; Steel, D. H.; MacLaren, R. E.; Ellis, S.; Harris, C. L.; Poor, S.; Jones, A. V.
Objective: Complement biomarker analysis in ocular fluid samples from subjects with geographic atrophy (GA) in a Phase I/II clinical trial of subretinal AAV2 complement factor I (CFI; FI) gene therapy, PPY988 (formerly GT005), to understand target pharmacokinetics/pharmacodynamics. Clinical findings were subsequently utilized to investigate the therapeutic dose in an in vitro complement activation assay. Design, Setting and Participants: Biomarker data were evaluated from 28 subjects in FOCUS, a Phase I/II clinical trial evaluating the safety and efficacy of three ascending doses of PPY988. Main Outcomes and Measures: Vitreous humor (VH) and aqueous humor (AH) from subjects before surgery and at serial timepoints (week 5 or 12, 36, 96) were evaluated for changes in levels of intact complement factors I, B and H (FI, FB, FH), components C3, C4, and C1q, and breakdown products (Ba, C3a, C3b/iC3b, C4b) using validated assays and Olink proteomics. A modified in vitro assay of complement activation modelling VH complement concentrations was used to compare PPY988 potency to the approved intravitreal C3 inhibitor pegcetacoplan (Apellis) and complement factor H (FH). Results: An average 2-fold increase in VH FI was observed post-treatment at week 36 and week 96. This correlated with a marked post-treatment reduction in the VH concentration of the FB breakdown product Ba and the Ba:FB ratio, but minimal changes in C3a and C3b/iC3b levels. Variable concordance of complement biomarker levels in VH versus AH suggests AH is not a reliable proxy for VH for complement activation. In the experimental comparison of doses, a 2-fold increase of FI achieved in the vitreous had only a minor effect on the complement amplification loop in vitro, indicating limited impact [IC50: 1229 nM]. Pegcetacoplan completely blocks C3a generation at concentrations much lower than the estimated trough level for monthly intravitreal injections [IC50: 2 nM]. Supplementation with FH in the assay revealed similar potency to pegcetacoplan [IC50: 6 nM]. Conclusions and Relevance: PPY988 subretinal gene therapy may not have provided sufficient FI protein to meaningfully modulate complement activation to slow GA growth. Reviewing VH biomarkers is important for understanding target expression, pathway engagement, and optimal dose, thereby informing future clinical development.
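IC50 values like those quoted above are read off a dose-response curve. A minimal log-linear interpolation between the two measured concentrations bracketing 50% inhibition, shown here with hypothetical data (an illustrative estimator, not the assay's actual curve-fitting procedure):

```python
import numpy as np

def ic50_interp(conc_nM, inhibition_pct):
    """Estimate IC50 by interpolating in log-concentration space between
    the two measured points bracketing 50% inhibition. Assumes inputs are
    sorted by ascending concentration with monotonically rising inhibition
    that crosses 50% (illustrative helper with made-up data below)."""
    c = np.log10(np.asarray(conc_nM, dtype=float))
    y = np.asarray(inhibition_pct, dtype=float)
    i = np.searchsorted(y, 50.0)                  # first point at or above 50%
    frac = (50.0 - y[i - 1]) / (y[i] - y[i - 1])  # position between brackets
    return 10 ** (c[i - 1] + frac * (c[i] - c[i - 1]))
```

Interpolating on the log-concentration axis reflects the sigmoidal shape of dose-response data; a full Hill-equation fit would be the more rigorous alternative.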
Beckwith, A. D.; McNamara, S. M.; Veturi, Y. A.; Manoharan, N.; de Carlo Forest, T. E.; Kinder, S.; Bearce, B.; Gnanaraj, R.; Lynch, A.; Singh, P.; Nebbia, G.; Mandava, N.; Kalpathy-Cramer, J.
We tested whether a Gompertz growth curve better describes and predicts geographic atrophy lesion enlargement than linear and effective-radius (square-root) models. We analyzed a retrospective, single-center cohort of 121 patients (181 eyes) with serial fundus autofluorescence imaging from October 2012 to April 2023, excluding eyes that had received prior geographic atrophy therapies or fewer than five gradable visits, creating a natural-history cohort. We fitted four candidate models (Gompertz, logistic, linear, and effective radius) within a hierarchical framework. We evaluated model accuracy using rolling out-of-sample forecasts, as assessed by continuous ranked probability scores. We assessed calibration by the prediction-interval width and coverage. The median follow-up was 5.8 years (IQR, 2.8 years), the mean age was 79.2 years (SD, 7.9 years), and 60% of the cohort were female. Gompertz achieved the lowest forecast error (0.45 mm2) versus logistic (0.48 mm2), linear (0.52 mm2), and effective radius (0.62 mm2), and received the highest pseudo-Bayesian model averaging weight (0.994). It yielded narrower 90% prediction intervals (2.41 mm2 vs. 3.99 mm2 for linear) and maintained these advantages at longer forecast horizons, where traditional models tended to overpredict. Differences were most pronounced during late (decelerating) growth. These findings demonstrate that Gompertz trajectories better capture lesion enlargement and modestly improve probabilistic forecasts compared with conventional approaches, supporting their use for patient counseling and for trial designs that account for natural growth deceleration.
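For readers unfamiliar with the candidate models: the Gompertz curve is sigmoidal, with growth accelerating toward an inflection near K/e and decelerating thereafter, which is exactly the late-phase behavior the linear and effective-radius models miss. A minimal numerical sketch (parameter values are illustrative, not fitted to this cohort):

```python
import math

def gompertz_area(t_years, a0_mm2, k_mm2, r):
    """Gompertz lesion area: starts at a0_mm2, saturates at k_mm2."""
    return k_mm2 * math.exp(math.log(a0_mm2 / k_mm2) * math.exp(-r * t_years))

def effective_radius_area(t_years, a0_mm2, growth_mm_per_yr):
    """Effective-radius (square-root) model: lesion radius grows linearly,
    so area grows without bound."""
    r0 = math.sqrt(a0_mm2 / math.pi)
    return math.pi * (r0 + growth_mm_per_yr * t_years) ** 2

# Illustrative trajectory: 2 mm^2 lesion with a 20 mm^2 carrying capacity
areas = [gompertz_area(t, 2.0, 20.0, 0.25) for t in range(0, 10, 2)]
increments = [b - a for a, b in zip(areas, areas[1:])]
# Growth accelerates early, then decelerates after the inflection (~K/e)
assert increments[1] > increments[2] > increments[3]
```

The unbounded effective-radius model, by contrast, keeps predicting growth at late horizons, consistent with the overprediction the study observed for traditional models.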
Cooper, G.; Burke, J.; Hamid, C.; Godden, E.; Dhaun, N.; King, S.; MacGillivray, T. J.; Baillie, J. K.; Griffith, D.; MacCormick, I. J. C.
Background: Shock involves microcirculatory dysfunction that is not well captured by measurements of large vessels, such as systemic blood pressure. The outer retinal microcirculation (the choroid) can be measured non-invasively and may reflect dysfunction in other organs. We tested the feasibility of measuring the retinal choroid in an intensive care setting and explored associations between choroidal measurements and disease severity. Methods: We performed optical coherence tomography on patients admitted to the intensive treatment unit and repeated imaging once 12-72 hours later. We measured choroidal anatomy using automated image segmentation, compared this to routine clinical data, and described change over time. Results: Of fifteen patients recruited, 80% (12) had successful baseline imaging and 40% (6) of these had follow-up imaging within intensive care. At baseline, patients with thicker choroids and greater vascularity had larger cumulative fluid balance and lower disease severity (Acute Physiology and Chronic Health Evaluation II) scores, haematocrit, and albumin. A measurable suprachoroidal space was seen in 75% (9) of patients, and this space tended to be larger in patients with lower heart rates. There was substantial intraindividual variation in choroidal measurements over time. Comment: Measuring the retinal choroid is feasible in patients with critical illness. Exploratory associations with systemic variables suggest that the choroid may provide information about the microvascular function of other major organs. The size and change of choroidal measurements may reflect perfusion pressure or vascular leak in response to inflammation.
Toral, M. A.; Ng, B.; Velez, G.; Yang, J.; Tsang, S. H.; Bassuk, A. G.; Mahajan, V. B.
Purpose: Anti-vascular endothelial growth factor (anti-VEGF) therapy is the standard of care for neovascular age-related macular degeneration (AMD), yet many patients exhibit persistent retinal degeneration, fibrosis, and incomplete therapeutic response. The molecular pathways underlying this incomplete response remain poorly understood. We sought to identify VEGF-independent signaling pathways active in the vitreous of anti-VEGF-treated AMD patients. Methods: We performed multiplex antibody-based proteomic profiling of 1,000 human proteins in vitreous samples from patients with neovascular AMD receiving anti-VEGF therapy (n=8) and comparative controls (n=6). Differential protein expression was assessed using one-way ANOVA, followed by gene ontology and pathway enrichment analyses. Drug-target relationships were evaluated to identify potential opportunities for therapeutic repositioning. Results: We identified 107 differentially expressed proteins (p<0.05), including key regulators of immune signaling, angiogenesis, and metabolism. Notably, multiple components of cytotoxic lymphocyte pathways were dysregulated, including IL-21R, SIGLEC-7, CTLA4, and IL-2-associated signaling. Enrichment analyses revealed significant activation of pathways related to T-cell activation, interleukin signaling, and leukocyte-mediated cytotoxicity. These immune signatures persisted despite suppression of VEGF signaling. Several clinically available immunomodulatory agents, including abatacept, sirolimus, and dupilumab, targeted pathways identified in this dataset. Conclusions: Anti-VEGF-treated neovascular AMD exhibits persistent cytotoxic immune signaling in the vitreous, suggesting that VEGF-independent immune mechanisms may contribute to ongoing retinal damage and incomplete therapeutic response. These findings provide a rationale for combination therapeutic strategies targeting both angiogenic and immune pathways in AMD.
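With only two groups (treated vs. control), the one-way ANOVA used here reduces to a ratio of between-group to within-group variance. A self-contained sketch of the F statistic (toy expression values, not the study's data):

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic across k groups of expression values."""
    all_vals = [v for g in groups for v in g]
    n, k = len(all_vals), len(groups)
    grand_mean = sum(all_vals) / n
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy vitreous levels for one protein: treated AMD (n=3) vs control (n=3)
f_stat = one_way_anova_f([[5.0, 6.0, 7.0], [1.0, 2.0, 3.0]])
print(f_stat)  # → 24.0
```

In the study this statistic is converted to a p-value per protein (against an F distribution with k-1 and n-k degrees of freedom) before the p<0.05 screen.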
Camacho-Garcia-Formenti, D.; Baylon-Vazquez, G.; Arriozola-Rodriguez, K. J.; Avalos-Ramirez, L. E.; Hartleben-Matkin, C.; Valdez Flores, H. F.; Hodelin-Fuentes, D.; Noriega Campero, A.
Background: Artificial intelligence (AI) shows promise in ophthalmology, but its potential in tertiary care settings in Latin America remains understudied. We evaluated a Mexican AI-powered screening tool against first-year ophthalmology residents in a tertiary care setting in Mexico City. Methods: We analysed 435 adult patients undergoing their first ophthalmic evaluation. AI and resident assessments were compared against expert annotations for retinal disease, cup-to-disk ratio (CDR) measurements, and glaucoma suspect classification. We also evaluated a synergistic approach combining AI and resident assessments. Results: For glaucoma suspect classification, AI outperformed residents in accuracy (88.6% vs 82.9%, p = 0.016), sensitivity (63.0% vs 50.0%, p = 0.116), and specificity (94.5% vs 90.5%, p = 0.062). The synergistic approach yielded a higher sensitivity (80.4%) than residents alone or AI alone (p < 0.001). AI's CDR estimates showed a lower mean absolute error (0.056 vs 0.105, p < 0.001) and higher correlation with expert measurements (r = 0.728 vs r = 0.538). In retinal disease assessment, AI demonstrated higher sensitivity (90.1% vs 63.0% for medium/high-risk, p < 0.001) and specificity (95.8% vs 90.4%, p < 0.001), and differences between AI and residents were statistically significant across all metrics. The synergistic approach achieved the highest sensitivity for retinal disease (92.6% for medium/high-risk, 100% for high-risk). Conclusion: AI outperforms first-year residents in key ophthalmic assessments. The synergistic use of AI and resident assessments shows potential for optimizing diagnostic accuracy, highlighting the value of AI as a supportive tool in ophthalmic practice, especially for early-career clinicians.
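The abstract does not spell out how AI and resident calls were combined; one common rule that raises sensitivity is to flag a patient when either reader does. A sketch of the screening metrics under that assumed OR rule (toy labels, not the study's data):

```python
def screening_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity from binary labels (1 = suspect)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy expert labels and reader calls
truth    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
ai       = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
resident = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
combined = [a or r for a, r in zip(ai, resident)]  # assumed OR rule

# The OR rule can only raise sensitivity (at some cost to specificity)
assert screening_metrics(truth, combined)["sensitivity"] >= max(
    screening_metrics(truth, ai)["sensitivity"],
    screening_metrics(truth, resident)["sensitivity"],
)
```

An AND rule would instead trade sensitivity for specificity; which combination the study used is not stated in the abstract.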
Lin, J. B.; Mataraso, S. J.; Chadha, M.; Velez, G.; Mruthyunjaya, P.; Aghaeepour, N.; Mahajan, V. B.
Purpose: There is a need for novel therapies for diabetic retinopathy (DR) because existing therapies treat only certain features of DR and do not work optimally for all patients. While proteomic studies provide insight into disease pathobiology, they are often limited to small sample sizes due to high costs, limiting their generalizability and reproducibility. Moreover, they often yield lists of tens to hundreds of differentially expressed proteins, making it difficult to prioritize the most biologically relevant biomarkers beyond arbitrary fold-change and false-discovery rate cutoffs. Here, we applied a two-stage multimodal AI approach: first, we integrated EHR and proteomics data to rationally prioritize candidate protein biomarkers, and then we validated these biomarkers in an independent cohort. These protein biomarkers of DR are grounded in the EHR data and thereby more likely to be biological drivers of disease. Methods: We obtained EHR data from patients with and without DR (N=319,997) from the STARR-OMOP database and obtained aqueous humor liquid biopsies from a subset of these patients (N=101) for high-resolution proteomic profiling. We developed Clinical and Omics Multi-Modal Analysis Enhanced with Transfer Learning (COMET) to perform integrated analysis of proteomics and all available EHR data to identify protein biomarkers of DR. The model was trained in two phases: it was first pretrained on patients with EHR data alone (N=319,896) and then fine-tuned on patients with both EHR and proteomics data (N=101), allowing it to learn both clinical and molecular features associated with DR. Findings from COMET were then validated with liquid biopsies from an independent validation cohort (N=164). Results: t-distributed stochastic neighbor embedding (t-SNE) analysis of EHR and proteomics data identified proteins clustering with related EHR features.
Levels of STX3 and NOTCH2, proteins involved in retinal function, were correlated with a diagnosis of macular edema, a record of a visual field exam, and a prescription for latanoprost, highlighting protein-EHR alignment. The pretrained multimodal COMET model was superior (AUROC=0.98, AUPRC=0.91) to models generated using either EHR or proteomics data alone or without pretraining (AUROC: 0.76 to 0.92; AUPRC: 0.47 to 0.74). The proteins SERPINE1, QPCT, AKR1C2, IL2RB, and SRSF6 were prioritized by the COMET model relative to the models without pretraining, supporting their potential role in DR pathobiology, and were subsequently validated in an independent cohort. Conclusion: We used multimodal AI to prioritize the protein biomarkers of DR most strongly linked to EHR elements, as well as to identify other protein biomarkers associated with disease features such as diabetic macular edema. These findings serve as a foundation for future mechanistic studies and highlight the synergistic value of using multimodal AI to fuse EHR and proteomics data for enhanced proteomics analysis.
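AUROC, the headline metric for COMET, equals the probability that a randomly chosen DR case receives a higher risk score than a randomly chosen control (the Mann-Whitney formulation). A minimal rank-based sketch (toy scores, not the model's outputs):

```python
def auroc(y_true, scores):
    """AUROC via the Mann-Whitney statistic: the fraction of
    positive/negative pairs ranked correctly, counting ties as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy risk scores: higher should mean more likely DR
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # → 0.75
```

This pairwise form makes clear why an AUROC of 0.98 is strong: nearly every case/control pair is ranked correctly, regardless of any score threshold.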
Takayama, T.; Uto, T.; Tsuge, T.; Kondo, Y.; Tampo, H.; Chiba, M.; Kaburaki, T.; Yanagi, Y.; Takahashi, H.
Background: Retinal breaks are critical lesions that can lead to retinal detachment and vision loss if not detected and treated early. Automated and precise delineation of retinal breaks in ultra-widefield fundus (UWF) images remains a significant challenge in ophthalmology. Objective: This study aimed to develop and validate a deep learning model based on the PraNet architecture for the accurate delineation of retinal breaks in UWF images, with a particular focus on segmentation performance in retinal break-positive cases. Methods: We developed a deep learning segmentation model based on the PraNet architecture, using a dataset of 8,083 cases and a total of 34,867 UWF images. Of these, 960 images contained retinal breaks, while the remaining 33,907 did not. The dataset was split into 34,713 images for training, 81 for validation, and 73 for testing. Model performance was evaluated using both image-wise segmentation metrics (accuracy, precision, recall, Intersection over Union (IoU), Dice score, and centroid distance score) and lesion-wise detection metrics (sensitivity and positive predictive value). Results: The PraNet-based model achieved an accuracy of 0.996, a precision of 0.635, a recall of 0.756, an IoU of 0.539, a Dice score of 0.652, and a centroid distance score of 0.081 for pixel-level detection of retinal breaks. The lesion-wise sensitivity was 0.885, and the positive predictive value (PPV) was 0.742. Conclusions: To our knowledge, this is the first study to present pixel-level localization of retinal breaks using deep learning on UWF images. Our findings demonstrate that the PraNet-based model provides precise and robust pixel-level segmentation of retinal breaks in UWF images. This approach offers a clinically applicable tool for the precise delineation of retinal breaks, with the potential to improve patient outcomes.
Future work should focus on external validation across multiple institutions and integration of additional annotation strategies to further enhance model performance and generalizability.
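The Dice score and IoU reported above are both overlap ratios over predicted and ground-truth pixels, related monotonically by IoU = Dice / (2 - Dice). A minimal sketch over flat binary masks (toy masks, not the study's segmentations):

```python
def dice_iou(pred, gt):
    """Pixel-wise Dice score and IoU for flat binary masks (lists of 0/1)."""
    inter = sum(p * g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt)
    union = total - inter
    dice = 2 * inter / total if total else 1.0   # empty masks: perfect match
    iou = inter / union if union else 1.0
    return dice, iou

# Toy 2x2 masks, flattened row-major
dice, iou = dice_iou([1, 1, 0, 0], [1, 0, 1, 0])
print(dice, iou)  # → 0.5 0.3333333333333333
assert abs(iou - dice / (2 - dice)) < 1e-12
```

Because the two metrics are monotonically related, a reported Dice of 0.652 pins down the IoU (0.652 / 1.348 ≈ 0.48 on a single mask); the study's slightly different 0.539 reflects averaging each metric separately across images.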